interpretation. The model is applied to a dataset of almost ten thousand deliveries
collected in an Italian region. The analysis confirms that standard regression
overestimates the impact of education on child health. With respect to the current
economic literature, our findings indicate that only high education has positive
consequences on child health, implying that policy efforts in education should have
benefits for welfare.
1 Introduction



Maternal education is known to drive wedges between infants' health at birth. These
wedges, generally measured in terms of weight at birth, are powerful determinants of
health and economic outcomes in adulthood. Low education may affect the initial
endowment of an infant's health human capital in a way that tends to persist over the
life course. In particular, economists have argued that inequality at birth may partly be
transmitted from one generation to the next, resulting in lower educational attainment,
poorer health status, and reduced earnings in adulthood. Rather than being used only as an
input, birth outcomes have been shown to be directly affected by maternal education.
Classical references that analyze the impact of maternal characteristics and behaviors on
infant well-being are Rosenzweig and Schultz (1983), Rosenzweig and Wolpin (1991), and
Currie and Moretti (2003).



In this paper, we follow this literature and propose a model for investigating whether
higher education can improve child health. We complement the studies of Almond et al.
(2011), Case and Paxson (2011), and many others, which focus on approaches for evaluating
the correlation between education and infant health. The common starting point for our
treatment is the hypothesis that this strong correlation may partially reflect the
influence of unobserved heterogeneity, especially when birthweight is taken as a proxy of
the health outcome. This calls into question the existence of important pathways for the
effect of education on health. For example, Behrman and Rosenzweig (2002) find that
parents with favorable heritable endowments obtain more schooling for themselves and are
more likely to marry each other, and that this increases the health of their children. In
addition, one would expect that educated mothers are more likely to adopt behaviors (not
smoking or drinking, enriching their nutritional intake, etc.) that could have a positive
impact on birth outcomes.

Different empirical strategies have been proposed to help mitigate problems of
endogeneity. In an effort to isolate the causal effect of education on birth outcomes,
one of the proposed strategies uses an instrumental variables approach. Quasi-experimental
infant health research, focused on primary school construction programs (Breierova and
Duflo, 2004; Chou et al., 2010) and on college openings (Currie and Moretti, 2003), finds
evidence of a causal effect, although the observational comparisons may even
underestimate the true effect. Unfortunately, public data do not always allow us to
identify an exogenous shock of interest, or valid instruments that affect education, to be
used within an instrumental variable setting. Another approach is based on panel data and
identifies the effects on outcomes at birth from changes in prenatal behavior or maternal
characteristics between pregnancies (Rosenzweig and Wolpin, 1991; Currie and Moretti,
2003; Abrevaya and Dahl, 2008). However, a concern about this identification strategy is
the presence of feedback effects, specifically those of prenatal care in later
pregnancies, which may be correlated with education and birth outcomes in earlier
pregnancies.

Our paper does not focus exclusively on the key issue of how maternal education causes
health outcomes; it also intends to simultaneously investigate other socio-economic
relationships, such as the effect of marital status on birth outcomes, exploiting the
growing availability of cross-sectional administrative data. Several approaches to causal
inference have been proposed in the literature. Among the best known, Neyman (1923) and
Rubin (1974) provide a definition of causal effects in terms of potential outcomes. We
relate to graphical models (Lauritzen, 1996) and influence diagrams (Dawid, 2002), which
represent extensions of path analysis (Wright, 1921). Here, we rely heavily on Pearl's
approach (Pearl, 1998, 2000, 2009, 2011) based on Structural Equation Models (SEMs;
Wright, 1921; Goldberger, 1972; Duncan, 1975; Bollen et al., 2008). Pearl extends the
role of structural models in the econometric literature to take into account the causal
interpretation of the coefficients, so that these models represent a mathematically
equivalent alternative to the potential outcome framework (Pearl, 2011). Note that
Pearl's approach also represents an important point of contact with other approaches. In
this regard, Holland (1986) outlines how the path diagrams commonly used in SEMs may also
be used with the Neyman-Rubin model. In addition, Lauritzen (1996) and Dawid (2002) place
the work of Pearl in the context of graphical models and influence diagrams, respectively.
Within the SEM causal framework, we also evaluate whether the educational level has an
effect on women's probability of being married and to what extent it can affect
gestational age or birthweight.

In the following, we base our analysis on a SEM approach that accounts for unobserved
heterogeneity by introducing a latent background variable. More precisely, we propose to
apply a latent class approach (Lazarsfeld, 1950; Lazarsfeld and Henry, 1968; Goodman,
1974), where we assume the existence of unobserved groups of individuals that are
homogeneous with respect to both the causes (i.e., education and marital status) and the
effect of interest. The resulting model is a special case of finite mixture SEM (Jedidi
et al., 1997; Dolan and van der Maas, 1998; Arminger et al., 1999; Vermunt and Magidson,
2005), based on a suitable number of consecutive equations in which: (i) unobserved
heterogeneity is represented by a discrete latent variable defining latent classes of
individuals, (ii) the causes may depend on the discrete latent variable and on other
covariates, and (iii) the response variables of interest depend on the causes, on the
discrete latent variable, and on other covariates. In this way, since the causal effect
is evaluated within homogeneous groups of individuals, it is still possible to read the
partial regression coefficients in terms of causal effects, as happens when we adjust for
observed confounders (Cox and Wermuth, 2004).



The model is estimated by an Expectation-Maximization (EM) algorithm (Dempster et al.,
1977), which the authors implemented through a series of R functions that are available
to the reader upon request.
Our contribution mainly provides new evidence on the estimation of the causal effect
of maternal education on infant health outcomes, attempting to control for confounding
factors. We use the population of singleton newborns in a region of Italy (Umbria) from
2007 to 2009 to examine these differences in the usual outcomes of birthweight and
gestational age. The estimation strategy, which controls for unobserved characteristics
of the mother, is designed so that differences in early health outcomes can be attributed
to differences in education and/or marital status. Although the regional sample should
have a smaller variability in the key variables, we shed light on the previous research
question, highlighting that more educated mothers give birth to children with higher
weight, whereas gestational age is not affected by education. More strikingly, we show
that these results unequivocally arise for mothers who hold at least a degree when
unobservable variables are taken into account by the introduction of latent classes.
Secondly, whereas high school and academic qualifications have a positive effect on the
mother's probability of being married, this family characteristic does not appear to be a
significant determinant of the inequality in birth outcomes, suggesting that marital
status is not an important pathway for the effect on health.

The outline of the paper is as follows. The next section describes the theoretical and
empirical background. Section 3 illustrates the data and provides preliminary evidence.
Section 4 discusses our framework for quantifying the causal effect of maternal social
characteristics on health outcomes at birth and our estimation strategy. In this section,
we first illustrate the adopted SEM approach and then describe the estimation algorithm
implemented to maximize the log-likelihood. Section 5 presents our main results, whose
implications for policy-makers are discussed in Section 6.
2 Conceptual framework



The actual benefits of any public health initiative aimed at reducing inequality at birth
crucially depend upon the estimates of the causal effect of mothers' characteristics and
behaviors and on the possibility of intervention by policy-makers. This section starts by
analyzing the hypotheses underlying the well-documented cross-sectional association
between education and birth outcomes, such as gestational age and birthweight. Then, it
discusses some insights related to intermediate and confounding variables, focusing on
the bias transmitted by unobserved variability across mothers.

In the following, by $y_i = (y_{i1}, y_{i2})$ we denote the vector of birth outcomes
(gestational age, birthweight) for each singleton delivery $i$, $i = 1, \ldots, n$, by
$z_i = (z_{i1}, z_{i2})$ we denote the vector of putative causes (mother's education,
marital status), and by $x_i$ a vector of mother-specific characteristics (citizenship,
age) other than those included in $z_i$. These characteristics may be associated with
$y_i$, but cannot be interpreted in a "causal" sense, being not modifiable in principle.
Furthermore, we introduce a vector $u_i$ which reflects mother-specific unobservable
determinants of child outcomes (e.g., genetic factors, unreported lifestyle behaviors).

Focusing only on observable variables, a simple multiple linear regression (i.e.,
estimated by Ordinary Least Squares, OLS) of $y_i$ on $z_i$ and $x_i$ has the regression
coefficients referred to the causes in $z_i$ as central parameters of interest for policy
purposes. Thus, if significant, these coefficients suggest that mothers may improve the
well-being of their children, with consequences on adult outcomes, through interventions
that encourage them to attend school for a longer period of time. Currie (2011) reviews
the works that study the long-term consequences of insufficient health at birth,
confirming a strong negative association with future performance in terms of schooling
attainment, test scores, residence in high income areas, and wages.^1 However, if we
ignore the existence of a potentially significant vector of omitted variables, $u_i$,
that may simultaneously affect child outcomes and putative causes, the true effect of
$z_i$ is confounded, and the corresponding regression coefficients are under- or
over-estimated. One would expect, for instance, that more educated mothers are more
likely to adopt other behaviors, such as stopping smoking or drinking, that could have a
positive impact on the child's characteristics. This means that the OLS estimator of the
regression coefficients is biased because of the correlation between the mother's
education and these unobserved variables.

^1 The economic literature has also shown the existence of selection among mothers
subjected to strong deprivation, which leads to underestimating the long-term causal
impact on adult outcomes. See the discussion in Currie (2011) about the selection effects
of the Second World War on German mothers. See Conti et al. (2010) for a recent
discussion of women's selection into higher education.

With respect to the economic literature discussed in the previous section, we provide
a different strategy of identification and estimation, following the SEM approach. We
use a number of recursive concatenated equations for constructing a prediction of birth
outcomes given education. Here, we anticipate some features that make our approach
suitable for empirical testing using cross-sectional administrative data.

First, the evidence that the causes of prematurity are less well understood than those of
low birthweight does not imply that the strong and significant correlation between the
infant health outcomes should be obscured. Unlike Almond et al. (2005), we do not model
birthweight as "caused" by the duration of gestation: although gestation explains a large
part of the overall variance, reverse causality may emerge, given that lower infant
weight during gestation may be a cause of preterm delivery. Therefore, we model an
associative rather than a causal relationship between birthweight and gestational age.

Unlike conventional empirical specifications, we do not estimate the effects of
determinants on infant health outcomes by establishing thresholds for premature birth
(e.g., gestation of less than 37 weeks) and low birthweight (e.g., birthweight of less
than 2500 grams). Indeed, the related economic costs have been found to vary non-linearly
across the distribution of outcomes, with peaks at very preterm delivery and at the low
end of the distribution. Given the aim of the paper, it is appropriate to use continuous
response variables to evaluate infant inequalities.

Second, as an important channel through which education may affect infant health
outcomes, we also examine whether education affects a mother's probability of being
married. There are relatively few theoretical models of marriage markets with pre-marital
investment in education. A recent work on this topic is Weiss et al. (2009), which
focuses on the role of background characteristics in the probability of being matched
with a skilled partner. On the other hand, a strand of the empirical literature focuses
on the effect of assortative mating,^2 finding a positive impact of education (Pencavel,
1998; Qian, 1998). Here, we directly model the probability of being married for mothers
with different educational levels, leaving out heterogeneity that concerns assortative
mating. Indeed, this relationship has itself an interpretation within marriage market
theory. On the other hand, studies on assortative mating based on own education as well
as background characteristics may be difficult to interpret, as the education
distributions of men and women in the population differ. As a result, the educational
level should increase the probability that the mother is married at the time of the
birth, because at least the production of household public goods, for example a "high
quality" child (Becker, 1985), determines gains from marriage. With respect to this
literature, the potential gains from household specialization have recently been reduced
in developed countries, given the growing participation of women in the labor market and
the postponement of marriage among more educated people. The disincentive to marry has
gone hand in hand with an increase in cohabitation, which works much like marriage.

^2 The literature on assortative mating addresses the question of who marries whom, as
well as who marries and who remains single (Becker, 1981). Positive assortative mating on
a certain characteristic means that individuals tend to match with partners who are
similar with respect to that characteristic.

Third, a major difficulty in many specific observational studies concerns whether all
appropriate background variables have been included in the model to ensure that the
relevant regression coefficients capture the causal effect of $z_i$ on $y_i$, so that the
term "cause" is appropriate for $z_i$. For example, it is known that mothers of some
ethnic groups could be predisposed, ceteris paribus, to give birth to children with
higher birthweight. Since this higher weight is statistically independent of education,
the causal estimation of the effect of education on infant well-being should control for
factors that cannot be influenced, directly or indirectly, by policy interventions. It is
known that the main limitation of many observational studies is the possible existence of
unobserved confounders whose omission seriously distorts the interpretation of the
dependence of interest, even when we control for many observed variables. To solve this
problem, we extend the SEM approach to a framework with latent classes potentially able
to account for unobserved heterogeneity.

Fourth, in the proposed approach intermediate variables are marginalized out, leading
to their exclusion from the analysis. For instance, intra-uterine growth retardation
(IUGR)^3 is a measure of inequality during pregnancy that anticipates the infant outcome
at birth and is a serious candidate for being an intermediate variable. Indeed, although
it is a main source of variation of birthweight, including IUGR is irrelevant for
detecting the potential total causal impact of the mother's characteristics on infant
health outcomes. In this set of intermediate results of education, which may not be valid
for predicting the investigated causal effect, we may also include whether the mother
smokes, a characteristic found to adversely affect birthweight,^4 along with prenatal
care visits and parity, as discussed in Currie (2011).



To summarize, in our theoretical model we assume that age and citizenship (both
included in the vector $x_i$) are attributes of women that are not modifiable, that
educational level ($z_{i1}$) may have a causal effect on marital status ($z_{i2}$), and
that both educational level and marital status may have a causal effect on gestational
age ($y_{i1}$) and birthweight ($y_{i2}$). Moreover, we assume that gestational age and
birthweight are inequality indicators with a likely high level of association, but
without a specific causal relationship.
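The assumptions summarized above imply a recursive factorization of the joint distribution of causes and outcomes. The following is only an illustrative sketch: it treats the unobservables as a single discrete latent variable with $K$ classes, and the class weights $\pi_k$, together with the assumption that they do not depend on $x_i$, are notation introduced here rather than taken from the text.

```latex
% Illustrative sketch: \pi_k and the generic densities p(.) are assumed notation.
p(z_i, y_i \mid x_i)
  = \sum_{k=1}^{K} \pi_k \,
    p(z_{i1} \mid u_i = k, x_i) \,
    p(z_{i2} \mid z_{i1}, u_i = k, x_i) \,
    p(y_{i1}, y_{i2} \mid z_{i1}, z_{i2}, u_i = k, x_i),
\qquad \pi_k = \Pr(u_i = k).
```

The last factor is left as a joint density of gestational age and birthweight, reflecting the choice to model their association without imposing a causal direction between them.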

^3 See Kramer (1987) or Almond et al. (2005) for a deeper discussion of the use of IUGR
as a proxy of inequality in child health.
^4 See Currie and Moretti (2003), Almond et al. (2005), and Abrevaya and Dahl (2008).
In the following, after describing the dataset used for our analysis, we will return to
the discussion of the proposed approach with more technical details.

3 Data and preliminary analyses 

In this section we first describe the dataset used for our study and then present some
preliminary analyses that support significant correlations between the key covariates,
such as education and marital status, and outcomes at birth.

3.1 The dataset 

The study is based on data obtained from the Standard Certificates of Live Birth (SCLB)
collected in Umbria (Italy) in 2007, 2008, and 2009. The SCLB is filled in within ten
days after the delivery by one of the attendants at the birth (e.g., doctor, midwife) and
provides information on infants' and mothers' characteristics. These data concern the
socio-economic and demographic characteristics of each mother, including maternal age,
citizenship, educational attainment, marital status, childbearing history, prenatal care,
and geographic residence, whereas, as anticipated in Section 2, the dataset does not
contain smoking habits before and during the pregnancy. The information on the father is
sparser and includes sex, citizenship, and education. For the newborn, the information
includes gestational age, birthweight, and pluriparity. The available dataset contains
information about more than 25,000 women. For our study, we limited our attention to
natural conceptions (i.e., without assisted fertilization methods), primiparous women,
and singleton births; moreover, only infants with a gestational age of at least 23 weeks
and a birthweight of at least 500 grams are taken into account. The total sample size,
which merges each mother and her baby, amounts to 9,005 records.

Definitions and some descriptive statistics for the variables used in the subsequent
analysis are shown in Table 1. As already mentioned, the main attention is focused on
the possible effect of maternal social characteristics on the inequalities in gestational
age and in birthweight of infants. Both variables appear normally distributed (Figure
2);^5 on average, the delivery takes place after 39.310 weeks of gestation (standard
deviation equal to 1.686) and the mean weight at birth is equal to 3.262 kg (with a
standard deviation equal to 0.487 kg). The two variables present a positive and moderate
correlation ($\rho = 0.56$), as confirmed by the scatter plot in Figure 3.

Commenting on Table 1, we observe that the women are on average 30 years old; 80% are
Italian, whereas 13% come from Eastern Europe. In addition, more than one half (52%) have
a high school diploma, followed by 28% with a higher educational level (degree or above);
the remaining 20% of women attained at most a compulsory educational level. Overall, 70%
of the women are married. As a limitation, our dataset does not contain information about
cohabitation, which is included in the category "not married"; this may bias the effect
of marital status on the child's outcomes, under-estimating the positive effects of the
production of public goods in the household.



Variable                   Category                   %      Mean    St.Dev.
Gestational age (weeks)                                     39.310     1.686
Birthweight (kg)                                             3.262     0.487
Age (years)                                                 30.040     5.288
Citizenship                Italian                  80.1
                           east-Europe              12.6
                           other citizenship         7.3
Education level            middle school or less    19.8
                           high school              51.9
                           degree and above         28.4
Marital status             married                  70.0
                           not married              30.0

Table 1: Distribution of variables

Lastly, in order to account for other potential effects on infants' weight, we report a
box plot of birthweight by birth month. The graph in Figure 1 shows a small variability
of birthweight over the year, and it corroborates the results obtained by an ANOVA test,
which lead us to strongly reject the hypothesis that winter babies tend to be smaller
than those born in summer months, discussed in the literature by Torche and Corvalan
(2010).

^5 The normality of gestational age and birthweight is also in line with kernel density
estimates.

Figure 1: Box plots for birthweight by birth month
3.2 Multiple regressions 

Through multiple linear regressions performed separately for gestational age and
birthweight, it is possible to discover some significant associations with the covariates
of Table 1. As shown in Tables 2 and 3, respectively, the number of gestational weeks and
the weight of the infant tend to decrease as the woman's age increases. As concerns
gestational age, we observe a significant association with citizenship, with women coming
from foreign countries tending to deliver earlier than Italian women. No other
significant effect emerges for the other covariates.

Some differences exist in the associative relations between birthweight and the
covariates. The statistical effect of citizenship is significant at the 10% level for
east-European women and at the 5% level for other citizenships (with respect to Italian
citizenship); birthweight is greater for east-European women than for Italians and
smaller for the other citizenships. We also observe a significant association between
birthweight and educational level: as expected, the higher the educational level, the
greater the birthweight. Finally, some evidence of association is observable for marital
status: unmarried women give birth to infants with a lower weight than married women.
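For reference, the two regressions discussed above can be written in a common form. This is a sketch reconstructed from the covariates and reference categories listed in Tables 2 and 3; the coefficient symbols are introduced here for illustration, and whether age enters centered is not stated in the text.

```latex
% Sketch only: coefficient names are assumed; h = 1 (gestational age), h = 2 (birthweight).
y_{ih} = \beta_{h0} + \beta_{h1}\,\mathrm{age}_i + \beta_{h2}\,\mathrm{age}_i^2
       + \sum_{c \in \{\text{east-Europe},\,\text{other}\}} \gamma_{hc}\,\mathbb{1}\{\mathrm{citizenship}_i = c\}
       + \sum_{e \in \{\text{high school},\,\text{degree or above}\}} \delta_{he}\,\mathbb{1}\{\mathrm{education}_i = e\}
       + \lambda_h\,\mathbb{1}\{\text{not married}_i\} + \varepsilon_{ih},
\qquad h = 1, 2.
```

Italian citizenship, middle school or less, and married are the reference categories, which is why their estimates are reported as 0.000 in the tables.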
Figure 2: Histogram and normal curve for gestational age (left) and birthweight (right)

Figure 3: Scatter plot: birthweight versus gestational age



The results obtained by the multiple linear regressions suggest proceeding with the
analysis by investigating the causal relationships between variables. To this aim, we
adopt a SEM framework that will be illustrated in the next section.
covariate     category                  est.     s.e.    t stat.   p-value
intercept     -                       39.325    0.051    772.686     0.000
age           -                       -0.019    0.004     -4.910     0.000
age^2         -                       -0.001    0.001     -1.336     0.181
citizenship   Italian                  0.000        -          -         -
citizenship   east-Europe             -0.242    0.059     -4.099     0.000
citizenship   other citizenship       -0.208    0.072     -2.887     0.004
education     middle school or less    0.000        -          -         -
education     high school              0.077    0.049      1.551     0.121
education     degree or above          0.077    0.057      1.345     0.179
marital       married                  0.000        -          -         -
marital       not married             -0.025    0.039     -0.640     0.522

Table 2: Regression results for the gestational age



covariate     category                  est.     s.e.    t stat.   p-value
intercept     -                        3.240    0.015    220.413     0.000
age           -                       -0.005    0.001     -4.159     0.000
age^2         -                       -0.000    0.000     -0.875     0.381
citizenship   Italian                  0.000        -          -         -
citizenship   east-Europe              0.032    0.017      1.847     0.065
citizenship   other citizenship       -0.050    0.021     -2.414     0.016
education     middle school or less    0.000        -          -         -
education     high school              0.032    0.014      2.243     0.025
education     degree or above          0.050    0.017      3.033     0.002
marital       married                  0.000        -          -         -
marital       not married             -0.019    0.011     -1.682     0.092

Table 3: Regression for the birthweight

4 Proposed approach 

In the following, we return to the conceptual framework described in Section 2 and
describe in more detail the resulting statistical model, which is used to analyze possible
effects of social characteristics of women on the inequalities in gestational age and
birthweight of infants. We first present some preliminary concepts about causality.
Secondly, we describe the adopted SEM, which differs from standard SEMs in two main
respects: (i) it is based on the presence of a discrete, rather than continuous, latent
variable and (ii) it accommodates any type of response rather than only continuous
responses. After having described the structural equations of the proposed model, we
illustrate maximum likelihood estimation of their parameters.
4.1 Preliminaries on causality



One of the main sources of systematic bias, which may lead to a wrong causal analysis
and is commonly encountered in observational studies, is the presence of confounding
effects that are ignored during the analysis (Hernan and Robins, 2012). We speak of a
confounding effect (Freedman, 1999) when two variables z and y have a common cause u that
confounds the true relationship between the putative cause z and the effect y (Figure 4,
panel (a)). A typical situation takes place when a strong observed association between
variables z and y may be explained partly or completely by controlling for the common
cause u. On the other hand, we can also encounter the opposite situation, in which the
true causal relationship between z and y is balanced and cancelled (so that z and y
appear statistically independent) by a relationship of equal strength but opposite sign
due to the common cause u. In all these cases, it is important to take explicitly into
account all the possible sources of heterogeneity of y, in order to avoid confounding
effects.



Figure 4: Causal relation between z and y and presence of a third variable u: (a) u as
common cause, (b) u as intermediate effect, (c) u as common effect, (d) u as cause acting
independently from z
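The common-cause situation of panel (a) can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the paper's data or method: the coefficients (a true effect of 1.0 for z and 2.0 for u) and the helper `ols` are hypothetical choices made for illustration. With these values, the population slope of y on z alone is 1 + 2 · cov(z, u)/var(z) = 2, whereas adjusting for u recovers the true effect of 1.

```python
import numpy as np

def ols(y, *regressors):
    """Least-squares coefficients of y on an intercept plus the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept

rng = np.random.default_rng(0)
n = 20_000
u = rng.normal(size=n)                       # common cause (unobserved in practice)
z = u + rng.normal(size=n)                   # putative cause, partly driven by u
y = 1.0 * z + 2.0 * u + rng.normal(size=n)   # true causal effect of z on y is 1.0

naive = ols(y, z)[1]        # ignores u, so it absorbs part of the effect of u
adjusted = ols(y, z, u)[1]  # controls for u, so it is close to the true effect
print(f"naive slope: {naive:.2f}, adjusted slope: {adjusted:.2f}")
```

In an observational study u is not available, which is why the adjustment in the last step cannot be carried out directly and a latent variable is introduced instead.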



Before proceeding, it is useful to point out that the presence of a common cause
of both z and y is the main situation that requires a statistical adjustment. In fact,
there are several situations that may confuse the researcher by leading to unnecessary or
improper adjustments for a third variable. One case is encountered when variable u has
an intermediate effect along the pathway from z to y (Figure 4, panel (b)). Controlling
for u may be advisable if we are interested in the direct effect of z on y (i.e., the
part of the effect not mediated through u), although it could be quite problematic
(Freedman, 1999; Greenland et al., 1999; Cox and Wermuth, 2004). On the other hand,
adjustment is useless if we are interested in the total effect of z on y. As outlined in
Section 2, this is the case with some variables in our study, such as IUGR.

Another situation takes place when both z and y have a common effect u (Figure 4,
panel (c)): as outlined by Greenland et al. (1999), in this case adjustment for u "is not
only unnecessary but irremediably harmful", as it would result in the so-called
adjustment-induced bias.

A final, simpler situation is illustrated in Figure 4, panel (d), where two causes of y
act independently of one another (as shown through the missing edge between z and u).
Therefore, if our interest is limited to the causal effect of z on y, ignoring u has no
consequences.
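The three situations of panels (b), (c), and (d) can be checked in the same spirit with synthetic data. The sketch below uses hypothetical linear models with unit-variance noise; all coefficients and the helper `slope_on_z` are assumptions made for illustration. In (b) the total effect is 0.3 + 0.5 · 0.8 = 0.7 and the direct effect is 0.3; in (c) z has no effect on y, yet adjusting for the common effect u induces a population coefficient of -0.5; in (d) the coefficient of z is 1.0 with or without u.

```python
import numpy as np

def slope_on_z(y, z, *controls):
    """Coefficient of z in a least-squares fit of y on an intercept, z and any controls."""
    X = np.column_stack([np.ones(len(y)), z, *controls])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

rng = np.random.default_rng(1)
n = 20_000

# (b) u is intermediate: z -> u -> y, plus a direct path z -> y
z = rng.normal(size=n)
u = 0.5 * z + rng.normal(size=n)
y = 0.3 * z + 0.8 * u + rng.normal(size=n)
total_b = slope_on_z(y, z)       # total effect of z
direct_b = slope_on_z(y, z, u)   # direct effect only, after adjusting for u

# (c) u is a common effect of z and y; z has no effect on y
z = rng.normal(size=n)
y = rng.normal(size=n)
u = z + y + rng.normal(size=n)
plain_c = slope_on_z(y, z)       # close to zero, as it should be
biased_c = slope_on_z(y, z, u)   # adjustment-induced bias

# (d) u is a second cause of y, independent of z
z = rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 * z + 1.0 * u + rng.normal(size=n)
plain_d = slope_on_z(y, z)       # ignoring u leaves the effect of z unchanged
adjusted_d = slope_on_z(y, z, u)

print(total_b, direct_b, plain_c, biased_c, plain_d, adjusted_d)
```

Only the common-cause case calls for adjustment; in these three cases adjusting for u is either a matter of which effect is of interest, harmful, or immaterial.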

4.2 Preliminaries on structural equation models and extensions 

As mentioned in Section 1, a useful statistical instrument to control for confounding
bias is represented by SEMs (Wright, 1921; Goldberger, 1972; Duncan, 1975; Bollen et al.,
2008). As discussed by Pearl (1998, 2009, 2011) and shown in detail by Cox and Wermuth
(2004), the partial regression coefficients of a SEM can be appropriately interpreted in
terms of causal effects on the response variable, provided that all the relevant
background variables have been included in the model. This requirement represents one of
the major difficulties for causal analysis. Indeed, as outlined by Muthen (1989), after
having controlled for the observed covariates, the residual unexplained heterogeneity in
the sample may still be substantial. Two main solutions have been proposed in the
literature to treat this problem: (i) one is based on the introduction of a continuous
latent effect that assumes different parameters at the individual level (Ansari et al.,
2000) and (ii) the other one, which is of interest in our contribution, is based on the
assumption that the unobserved heterogeneity may be captured by a limited number of
(unobserved) groups or classes of individuals. This latter approach is known as finite
mixture SEM and it was introduced independently by Jedidi et al. (1997), Dolan and van
der Maas (1998), and Arminger et al. (1999). See also Vermunt and Magidson (2005) for a
clear illustration of different aspects, such as model specification without and with
covariates, estimation, and model selection, and see Muthen (2002) for a wide overview
and classification of different types of SEM. More precisely, in a finite mixture SEM a
different SEM may be specified for each mixture component, so that different components
are allowed to have different parameter values and even different model types. In
particular, we introduce in each structural equation a discrete latent variable u, the
distribution of which is based on K support points with specific mass probabilities. In
this way, u represents a common cause for all the responses. Moreover, the model we
propose is configured as a special case of finite mixture SEM. Indeed, we assume that the
K latent classes differ from one another only in their intercepts, while the functional
form of each regression equation and the values of the structural coefficients are
assumed to be constant among the classes.

With respect to standard SEMs that are based on continuous latent variables, the
extension to the finite mixture approach presents some advantages. Firstly, each mixture
component identifies a homogeneous class of individuals with very similar latent
characteristics, so that, in a decisional context, individuals in the same latent class will
receive the same treatment (Lazarsfeld, 1950; Lazarsfeld and Henry, 1968; Goodman,
1974). Moreover, this assumption allows the SEM to be estimated in a semi-parametric way,
namely without formulating any parametric assumption on the distribution of the latent
variable.

Another useful extension of the standard SEM approach, which is considered in the
present paper, derives from the observation that, in their original formulation, SEMs are
based on continuous observed variables and so accommodate only a few types of application.
A more general framework is obtained by adopting a generalized formulation (Skrondal
and Rabe-Hesketh, 2004, 2005; Bollen et al., 2008), which allows mixed types of response,
that is, both continuous and ordinal or binary observed responses, to be taken into account
in the same set of structural equations. In this regard, since among the putative causes we
have categorical variables for the educational level (with categories: 1 for middle school or
less, 2 for high school, and 3 for degree or above) and the marital status (with categories:
0 for married, 1 for not married), we introduce a latent continuous variable $z_{il}^*$
underlying each observable variable $z_{il}$. In particular, we assume that

$$z_{il} = G_l(z_{il}^*),$$

where $G_l(\cdot)$ is a function which may depend on specific parameters according to the
different nature of $z_{il}$. We consider the following cases:

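• when the observed response is of a continuous type, an identity function is adopted,
that is, $G_l(z_{il}^*) = z_{il}^*$;

• when the observed response is binary (i.e., $z_{il} = 0, 1$), then

$$G_l(z_{il}^*) = I\{z_{il}^* > 0\}, \tag{1}$$

where $I\{\cdot\}$ is the indicator function assuming value 1 if its argument is true and
zero otherwise;

• when the observed response is ordinal with categories $j = 1, \ldots, J_l$, we introduce a
set of cut-points $\tau_{l1} \geq \ldots \geq \tau_{l,J_l-1}$ and we define

$$G_l(z_{il}^*) = \begin{cases}
1 & z_{il}^* \leq -\tau_{l1}, \\
2 & -\tau_{l1} < z_{il}^* \leq -\tau_{l2}, \\
\vdots & \vdots \\
J_l & z_{il}^* > -\tau_{l,J_l-1}.
\end{cases} \tag{2}$$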
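The three cases above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the authors' implementation; the ordinal rule assumes decreasing cut-points, as in (2).

```python
from typing import Sequence


def g_continuous(z_star: float) -> float:
    """Identity link: the latent value is observed directly."""
    return z_star


def g_binary(z_star: float) -> int:
    """Threshold link of equation (1): 1 if the latent value is positive."""
    return 1 if z_star > 0 else 0


def g_ordinal(z_star: float, cutpoints: Sequence[float]) -> int:
    """Threshold link of equation (2).

    `cutpoints` holds tau_1 >= ... >= tau_{J-1}. Category j is returned when
    -tau_{j-1} < z_star <= -tau_j, with the outer bounds taken as -inf and +inf.
    """
    category = 1
    for tau in cutpoints:
        if z_star > -tau:
            category += 1
    return category


# Example with the two cut-points reported later for educational level
# (0 and -2.695), so the thresholds on the latent scale are 0 and 2.695.
print(g_ordinal(-1.0, [0.0, -2.695]))  # 1 (middle school or less)
print(g_ordinal(1.0, [0.0, -2.695]))   # 2 (high school)
print(g_ordinal(3.0, [0.0, -2.695]))   # 3 (degree or above)
```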



4.3 The proposed finite mixture SEM 

In the following, coherently with the theoretical model illustrated in Section 2 and in
accordance with the notation previously introduced, the assumed causal relationships
among the considered variables are expressed through a SEM composed of three equations
for each of the n singleton deliveries in the dataset. Two of these equations refer
to the causes, educational level ($z_{i1}$) and marital status ($z_{i2}$), whereas the third one
refers to the two correlated birth outcomes: gestational age ($y_{i1}$) and birthweight ($y_{i2}$).
Moreover, the vector $x_i$ is composed of observations on the variables age and squared age
(both centered with respect to the mean value), and on dummies for citizenship
(Italian is the reference category).

For every delivery i, i = 1, ..., n, the generalized linear structural equations are as
follows:

• Equation 1 (educational level): we assume that $z_{i1} = G_1(z_{i1}^*)$, with $G_1$ defined as in
(2), and

$$z_{i1}^* = \mu_1 + \alpha_{i1} + x_i'\beta_1 + \varepsilon_{i1}, \tag{3}$$

where $\mu_1 + \alpha_{i1}$ is a specific intercept for subject i, $\beta_1$ is a vector of regression
coefficients for the covariates in $x_i$, and $\varepsilon_{i1}$ is a random error term with logistic
distribution;

• Equation 2 (marital status): we assume that $z_{i2} = G_2(z_{i2}^*)$, with $G_2$ defined as in
(1), and

$$z_{i2}^* = \mu_2 + \alpha_{i2} + x_i'\beta_2 + z_{i1}'\gamma + \varepsilon_{i2}, \tag{4}$$

where $\mu_2 + \alpha_{i2}$ is the subject-specific intercept, $\beta_2$ and $\gamma$ are vectors of regression
coefficients, and $\varepsilon_{i2}$ is an error term with logistic distribution, which is independent
of $\varepsilon_{i1}$;

• Equation 3 (gestational age, birthweight): in this case the observable variables, which
we collect in the vector $y_i = (y_{i1}, y_{i2})'$, are continuous and so we directly assume
that

$$y_i = \nu + \delta_i + \Phi x_i + \Psi z_i + \eta_i, \tag{5}$$

where $\nu = (\nu_1, \nu_2)'$, $\delta_i = (\delta_{i1}, \delta_{i2})'$, $\Phi = (\phi_1, \phi_2)'$, $\Psi = (\psi_1, \psi_2)'$, and
$\eta_i = (\eta_{i1}, \eta_{i2})'$; the last is a vector of error terms, which is assumed to follow a
bivariate Normal distribution centered at 0 with variance-covariance matrix $\Sigma$ and to be
independent of the previous error terms. Here, we have two subject-specific intercepts,
that is, $\nu_1 + \delta_{i1}$ for the gestational age and $\nu_2 + \delta_{i2}$ for the birthweight. Accordingly,
we have specific regression coefficients, which are collected in $\phi_1$ and $\psi_1$ for the first
response variable and in $\phi_2$ and $\psi_2$ for the second one.
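To make the data-generating process implied by equations (3)-(5) concrete, the following Python sketch simulates one delivery. It is only an illustration under stated assumptions: the numerical values are loosely taken from the estimates reported in Section 5, citizenship is omitted so that $x_i$ contains only centered age and its square, and the function and constant names are hypothetical.

```python
import math
import random

# Illustrative values only (assumption): loosely based on Tables 5-8,
# with citizenship dropped from the covariates.
PI = [0.931, 0.028, 0.041]                                  # class weights pi_k
XI1 = [0.005, -0.165, -0.005]                               # support points of alpha_i1
XI2 = [0.026, 0.289, -0.794]                                # support points of alpha_i2
ZETA = [(0.178, 0.005), (-6.086, -1.245), (0.123, 0.728)]   # support points of delta_i
MU1, BETA1, TAU = 2.053, (0.103, -0.009), (0.0, -2.695)
MU2, BETA2, GAMMA = -0.763, (-0.027, 0.008), (-0.152, -0.468)
NU = (39.346, 3.238)
PHI = ((-0.015, -0.001), (-0.004, 0.0))
PSI = ((0.025, 0.029, 0.025), (0.023, 0.043, 0.011))        # high school, degree, not married
SIGMA = ((1.776, 0.248), (0.248, 0.171))


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def logistic_draw(rng):
    """Draw from the standard logistic distribution by inverting its cdf."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return math.log(u / (1.0 - u))


def simulate_unit(age_centred, rng):
    x = (age_centred, age_centred ** 2)
    k = rng.choices(range(len(PI)), weights=PI)[0]          # latent class (0-based)
    # Equation (3): ordinal educational level through the thresholds in (2).
    z1_star = MU1 + XI1[k] + dot(x, BETA1) + logistic_draw(rng)
    z1 = 1 + sum(z1_star > -tau for tau in TAU)
    edu = (int(z1 == 2), int(z1 == 3))                      # dummies, baseline = category 1
    # Equation (4): marital status (1 = not married).
    z2_star = MU2 + XI2[k] + dot(x, BETA2) + dot(edu, GAMMA) + logistic_draw(rng)
    z2 = int(z2_star > 0)
    # Equation (5): bivariate Normal errors via a Cholesky factor of Sigma.
    l11 = math.sqrt(SIGMA[0][0])
    l21 = SIGMA[1][0] / l11
    l22 = math.sqrt(SIGMA[1][1] - l21 ** 2)
    e1, e2 = rng.gauss(0, 1), rng.gauss(0, 1)
    eta = (l11 * e1, l21 * e1 + l22 * e2)
    z = edu + (z2,)
    y = tuple(NU[r] + ZETA[k][r] + dot(PHI[r], x) + dot(PSI[r], z) + eta[r]
              for r in range(2))
    return {"class": k + 1, "education": z1, "not_married": z2,
            "gestational_age": y[0], "birthweight": y[1]}


sample = [simulate_unit(0.0, random.Random(seed)) for seed in range(5)]
```

Note that the latent class enters all three equations only through the intercepts, which is the restriction that distinguishes the proposed model from a general finite mixture SEM.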

Commenting on the above equations, we note that the parameters of most interest in
the present causal analysis are the regression coefficients in $\Psi$. Moreover, concerning the
individual-specific parameters $\alpha_{i1}$, $\alpha_{i2}$, and $\delta_i$, we clarify that, due to the latent class
assumptions described in the previous section, these parameters have a discrete
distribution with K support points and corresponding probabilities (or weights). In particular,
for the k-th class, with k = 1, ..., K, the support points for $\alpha_{i1}$ and $\alpha_{i2}$ are denoted by
$\xi_{k1}$ and $\xi_{k2}$, respectively, whereas the vector of support points for $\delta_i$ is denoted by $\zeta_k$; the
corresponding weight is denoted by $\pi_k$. Support points and class weights are estimated
on the basis of the data, together with the other parameters involved in the previous
structural equations.

In order to make the model identifiable, the support points are suitably constrained by
fixing their (weighted) mean at 0, so that $\mu_1$, $\mu_2$, and $\nu$ play the role of average intercepts.
For the same reason, we also constrain the first cut-point involved in equation (2) for the
educational level ($\tau_{11}$) to be equal to 0.

Finally, regarding the error terms involved in the three equations, we note that $\varepsilon_{i1}$ and
$\varepsilon_{i2}$ are assumed to have a logistic distribution. Then, due to the specified $G_l(\cdot)$ functions,
a global (or cumulative) logit parameterization, as used in the proportional-odds model of
McCullagh (1980), results for the conditional distribution of $z_{i1}$, whereas a standard logit
parametrization results for $z_{i2}$; see also Agresti (2002). In fact, we have
that

$$\log \frac{p(z_{i1} \geq j \mid \alpha_{i1}, x_i)}{p(z_{i1} < j \mid \alpha_{i1}, x_i)} = \mu_1 + \tau_{1,j-1} + \alpha_{i1} + x_i'\beta_1, \quad j = 2, 3, \tag{6}$$

and

$$\log \frac{p(z_{i2} = 1 \mid \alpha_{i2}, x_i, z_{i1})}{p(z_{i2} = 0 \mid \alpha_{i2}, x_i, z_{i1})} = \mu_2 + \alpha_{i2} + x_i'\beta_2 + z_{i1}'\gamma. \tag{7}$$

Note that we could also assume that both $\varepsilon_{i1}$ and $\varepsilon_{i2}$ have a Normal distribution, resulting
in a parametrization based on probits and ordered probits, but this would have a small
impact on the model specification, while making the estimation more complex. On the
other hand, given the continuous nature of the variables in $y_i$, it is natural to assume a
bivariate Normal distribution in the third equation. Note, in particular, that this distribution
depends on the variance-covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix},$$

where $\sigma_1^2$ is the variance of $y_{i1}$, $\sigma_2^2$ is the variance of $y_{i2}$, and $\sigma_{12} = \sigma_{21}$ is the covariance
between $y_{i1}$ and $y_{i2}$. All are free parameters, so we allow a free correlation between the
two variables, even given the observable and the unobservable variables. This correlation is
measured by the index

$$\rho_{12} = \frac{\sigma_{12}}{\sigma_1 \sigma_2}.$$

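As a quick numerical check of the correlation index, the short Python snippet below plugs in the variance and covariance estimates reported in Table 7 (Section 5); the variable names are ours.

```python
import math

# Estimates reported in Table 7.
var_gestational_age = 1.776   # sigma_1^2
var_birthweight = 0.171       # sigma_2^2
covariance = 0.248            # sigma_12

rho = covariance / math.sqrt(var_gestational_age * var_birthweight)
print(f"{rho:.3f}")  # 0.450, the correlation reported in Table 7
```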
4.4 Model estimation 

We estimate the parameters of the model previously introduced by the
maximum likelihood method. This requires deriving the joint distribution of $(z_{i1}, z_{i2}, y_i)$,
that is, the conditional distribution of these variables given $x_i$, once the individual-specific
parameters $\alpha_{i1}$, $\alpha_{i2}$, and $\delta_i$ have been integrated out. This distribution may be expressed
as a finite mixture, that is,

$$f(z_{i1}, z_{i2}, y_i \mid x_i) = \sum_{k=1}^{K} \pi_k \, p(z_{i1} \mid \alpha_{i1} = \xi_{k1}, x_i) \, p(z_{i2} \mid \alpha_{i2} = \xi_{k2}, x_i, z_{i1}) \, f(y_i \mid \delta_i = \zeta_k, x_i, z_i),$$

where the probability mass functions for $z_{i1}$ and $z_{i2}$ are defined through (6) and (7),
whereas $f(y_i \mid \delta_i = \zeta_k, x_i, z_i)$ is the density function, computed at $y_i$, of a bivariate Normal
distribution with mean $\nu + \zeta_k + \Phi x_i + \Psi z_i$ and variance-covariance matrix $\Sigma$; see equation
(5).

Under the assumption that the sample units are independent of each other, the
log-likelihood of the model to be maximized for the estimation is

$$\ell(\theta) = \sum_{i=1}^{n} \log f(z_{i1}, z_{i2}, y_i \mid x_i),$$

with $\theta$ denoting the vector of all model parameters. Maximization of $\ell(\theta)$ may be
efficiently performed through an Expectation-Maximization (EM) algorithm (Dempster
et al., 1977). In the following, we sketch this algorithm, referring to more specialized
papers on the topic; see, for instance, Bartolucci and Forcina (2006) and the references
therein. Moreover, as already mentioned, for the present application we implemented the
algorithm in a set of R functions that we make available to the reader upon request.
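The mixture density and the log-likelihood above can be sketched in a few lines of Python. This is not the R implementation mentioned in the text: it is a minimal illustration with hypothetical function names, in which the linear predictors `lin1`, `lin2`, and `mean_y` are assumed to be precomputed for each unit and to exclude the class-specific intercepts.

```python
import math


def expit(t):
    """Logistic cdf, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def p_ordinal(j, lin, taus):
    """P(z = j) under the global logit (6); taus are decreasing cut-points."""
    upper = 1.0 if j == 1 else expit(lin + taus[j - 2])              # P(z >= j)
    lower = 0.0 if j == len(taus) + 1 else expit(lin + taus[j - 1])  # P(z >= j + 1)
    return upper - lower


def p_binary(z, lin):
    """P(z = 1) = expit(lin), as in the standard logit parameterization."""
    p1 = expit(lin)
    return p1 if z == 1 else 1.0 - p1


def bvn_pdf(y, mean, sigma):
    """Density of a bivariate Normal distribution with covariance matrix sigma."""
    (s11, s12), (_, s22) = sigma
    det = s11 * s22 - s12 ** 2
    d1, d2 = y[0] - mean[0], y[1] - mean[1]
    quad = (s22 * d1 ** 2 - 2.0 * s12 * d1 * d2 + s11 * d2 ** 2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def unit_density(z1, z2, y, lin1, lin2, mean_y, pi, xi1, xi2, zeta, taus, sigma):
    """Finite mixture density of one unit: sum over the K latent classes."""
    total = 0.0
    for k in range(len(pi)):
        mean_k = (mean_y[0] + zeta[k][0], mean_y[1] + zeta[k][1])
        total += (pi[k] * p_ordinal(z1, lin1 + xi1[k], taus)
                  * p_binary(z2, lin2 + xi2[k]) * bvn_pdf(y, mean_k, sigma))
    return total


def log_likelihood(units, **params):
    """Sum of the log mixture densities over independent units."""
    return sum(math.log(unit_density(**u, **params)) for u in units)
```

Each element of `units` is a dictionary with keys `z1`, `z2`, `y`, `lin1`, `lin2`, and `mean_y`, while `params` collects `pi`, `xi1`, `xi2`, `zeta`, `taus`, and `sigma`.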

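The EM algorithm is based on the so-called complete data log-likelihood, which could
be computed if we knew the latent class to which every sample unit belongs. This function
may be expressed as

$$\ell^*(\theta) = \sum_{k=1}^{K} \sum_{i=1}^{n} w_{ik} \log\left[ p(z_{i1} \mid \alpha_{i1} = \xi_{k1}, x_i) \, p(z_{i2} \mid \alpha_{i2} = \xi_{k2}, x_i, z_{i1}) \, f(y_i \mid \delta_i = \zeta_k, x_i, z_i) \right] + \sum_{k=1}^{K} \sum_{i=1}^{n} w_{ik} \log \pi_k, \tag{8}$$

where $w_{ik}$ is a dummy variable equal to 1 if unit i belongs to latent class k and to 0
otherwise. Based on this function, the EM algorithm alternates the following two steps until
convergence in $\ell(\theta)$: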

• step E: compute the conditional expected value of $\ell^*(\theta)$ given the observed data
and the current value of the parameters in $\theta$; this is equivalent to substituting every
dummy variable $w_{ik}$ in (8) by the corresponding conditional expected value

$$\hat{w}_{ik} = \frac{\pi_k \, p(z_{i1} \mid \alpha_{i1} = \xi_{k1}, x_i) \, p(z_{i2} \mid \alpha_{i2} = \xi_{k2}, x_i, z_{i1}) \, f(y_i \mid \delta_i = \zeta_k, x_i, z_i)}{f(z_{i1}, z_{i2}, y_i \mid x_i)};$$

• step M: maximize the expected value of $\ell^*(\theta)$ obtained above with respect to the
model parameters; to update the class weights we have an explicit solution given by

$$\pi_k = \frac{\sum_{i=1}^{n} \hat{w}_{ik}}{n}, \quad k = 1, \ldots, K.$$

Moreover, for the parameters involved in (3) and (4) we use simple iterative
algorithms that are currently used to maximize the weighted log-likelihood of a
proportional odds model (McCullagh, 1980), whereas the parameters in equation (5) are
updated by solving a weighted least squares problem.
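The two closed-form pieces of the algorithm, the posterior probabilities of the E-step and the class-weight update of the M-step, can be sketched as follows. The function names are hypothetical, and the component terms $\pi_k\, p(\cdot)\, p(\cdot)\, f(\cdot)$ are assumed to have been computed beforehand for every unit and class; the weighted proportional-odds and weighted least squares updates are not shown.

```python
def e_step(component_terms):
    """Posterior class probabilities w_hat[i][k].

    `component_terms[i][k]` is pi_k * p(z_i1 | .) * p(z_i2 | .) * f(y_i | .)
    for unit i and class k; each row is normalised by the mixture density.
    """
    posteriors = []
    for row in component_terms:
        total = sum(row)  # f(z_i1, z_i2, y_i | x_i)
        posteriors.append([term / total for term in row])
    return posteriors


def update_class_weights(posteriors):
    """Explicit M-step solution: pi_k = sum_i w_hat[i][k] / n."""
    n = len(posteriors)
    n_classes = len(posteriors[0])
    return [sum(row[k] for row in posteriors) / n for k in range(n_classes)]


# Toy example with two units and two classes.
w_hat = e_step([[0.2, 0.2], [0.1, 0.3]])
new_pi = update_class_weights(w_hat)   # [0.375, 0.625]
```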

The value of $\theta$ at convergence of the EM algorithm is taken as the maximum likelihood
estimate of this parameter vector, denoted by $\hat{\theta}$. To deal with the well-known problem of
multimodality of the likelihood that characterizes finite mixture and latent variable models,
we suggest initializing the estimation algorithm with both deterministic and random starting
values. Finally, standard errors for the parameter estimates are obtained by inverting
the observed information matrix, which is obtained numerically from the score function
that, in turn, is obtained by exploiting a result due to Oakes (1999).



5 Results 



In the following, we illustrate the results obtained through the finite mixture SEM
presented in the previous section and applied to the dataset on the 9,005 newborns
collected in the Region of Umbria; see Section 3.1 for a description of the dataset. Firstly,
we give the results of the selection of the optimal number of latent classes
(Table 4). Secondly, the estimated regression coefficients are reported separately for each
structural equation in Tables 5, 6, and 7. Finally, the adopted latent structure is
described in Table 8, which shows the estimated support points for each latent class and
the corresponding weights.

First of all, in applying the proposed finite mixture approach, the choice of the
optimal number of latent classes K is of crucial importance. For this aim, several studies
(Roeder and Wasserman, 1997; Dasgupta and Raftery, 1998; McLachlan and Peel, 2000)
conclude that the Bayesian Information Criterion (BIC; Schwarz, 1978) performs
adequately for choosing K. We recall that this criterion is based on penalizing
the maximum value of the log-likelihood $\hat{\ell}$ by a term depending on the number of free
parameters (#par) and on the sample size (n):

$$\mathrm{BIC} = -2\hat{\ell} + \log(n)\,\#\mathrm{par}.$$

In practice, we fit the adopted SEM with increasing values of K and choose as optimal
the value just before the first increase of the BIC index. On the basis of the results
shown in Table 4, we obtain the minimum BIC value for K = 3 latent
classes.



K     log-likelihood     #par     BIC
1     -35700.768         32       71692.914
2     -34536.422         37       69409.750
3     -34488.589         42       69359.610
4     -34467.548         47       69363.055

Table 4: Results from the preliminary fitting: number of mixture components (K), maximum
log-likelihood ($\hat{\ell}$), number of parameters (#par), and BIC index
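The BIC values in Table 4 can be reproduced directly from the reported log-likelihoods, numbers of parameters, and sample size (n = 9,005). The following Python snippet is a small illustrative check; the helper name `bic` is ours.

```python
import math

n_obs = 9005
# K: (maximum log-likelihood, number of free parameters), from Table 4.
fits = {1: (-35700.768, 32), 2: (-34536.422, 37),
        3: (-34488.589, 42), 4: (-34467.548, 47)}


def bic(loglik, n_par, n):
    """BIC = -2 * loglik + log(n) * #par."""
    return -2.0 * loglik + math.log(n) * n_par


bic_values = {k: bic(ll, n_par, n_obs) for k, (ll, n_par) in fits.items()}
best_k = min(bic_values, key=bic_values.get)  # K = 3 minimises the BIC
```

Up to rounding of the reported log-likelihoods, the computed values agree with the last column of Table 4.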



We now consider the estimates for the structural equations (3) and (4) about
education and marital status, respectively. In particular, we observe that the educational
level significantly increases with the age of the woman and is lower for foreigners than
for Italians (Table 5). The effects of age and citizenship are also highly significant with
reference to the marital status (Table 6): the probability of being married increases with
age and is higher for foreign women. Moreover, a causal effect of educational level on
marital status is detected after controlling for the latent classes: the higher the educational
level, the higher the probability of being married.



covariate                     category             est.      s.e.     t stat.    p-value
intercept ($\mu_1$)           -                    2.053     0.039    52.285     0.000
1st cutpoint ($\tau_{11}$)    -                    0.000     -        -          -
2nd cutpoint ($\tau_{12}$)    -                    -2.695    0.031    -20.780    0.000
age                           -                    0.103     0.004    23.405     0.000
age^2                         -                    -0.009    0.001    -14.587    0.000
citizenship                   Italian              0.000     -        -          -
citizenship                   Eastern Europe       -0.806    0.069    -11.712    0.000
citizenship                   other citizenship    -1.100    0.086    -12.780    0.000

Table 5: Regression results for educational level

covariate                category                 est.      s.e.     t stat.    p-value
intercept ($\mu_2$)      -                        -0.763    0.065    -11.313    0.000
age                      -                        -0.027    0.005    -5.487     0.000
age^2                    -                        0.008     0.001    12.381     0.000
citizenship              Italian                  0.000     -        -          -
citizenship              Eastern Europe           -0.679    0.082    -8.264     0.000
citizenship              other citizenship        -0.677    0.101    -6.701     0.000
education                middle school or less    0.000     -        -          -
education                high school              -0.152    0.064    -2.375     0.018
education                degree or above          -0.468    0.076    -6.123     0.000

Table 6: Regression results for marital status



We now analyze the results shown in Table 7 about the outcome variables, according to
the formulation of equation (5). Both gestational age and birthweight decrease as the mother's
age increases, while the association with citizenship is less uniform. Women from
Eastern Europe deliver significantly earlier than Italians, but their newborns have a higher
weight. On the other hand, for women from other countries the estimated differences with
respect to Italian women are negative on both response variables, but they are not
significant at the 5% level (p-values equal to 0.064 and 0.108).

By comparing Table 7 with Tables 2 and 3, it is possible to draw some conclusions
about the causal effects of educational level and marital status on the birth outcomes.
Regarding marital status, the analysis confirms the absence of any causal effect. With
reference to the educational level, the increase in the p-values denotes the presence of a
confounding effect. However, even after controlling for a latent common cause, a significant
effect persists on the birthweight: a higher educational level causes a higher birthweight.
Note that the causal effect is well defined for the degree or above category, with a p-value
equal to 0.014, while this is no longer true for mothers with a high school education.
Moreover, we observe that the correlation between the outcomes is not very large
($\rho_{12} = 0.45$). Given that gestational age and birthweight are conditioned on covariates and
latent variables, this is compatible with causes affecting the two outcomes in different ways.

response var.      covariate            category                 est.      s.e.     t stat.    p-value
Gestational age    intercept ($\nu_1$)  -                        39.346    0.044    905.935    0.000
                   age                  -                        -0.015    0.003    -4.789     0.000
                   age^2                -                        -0.001    0.000    -2.544     0.011
                   citizenship          Italian                  0.000     -        -          -
                   citizenship          Eastern Europe           -0.194    0.049    -3.942     0.000
                   citizenship          other citizenship        -0.112    0.060    -1.855     0.064
                   education            middle school or less    0.000     -        -          -
                   education            high school              0.025     0.042    0.608      0.543
                   education            degree or above          0.029     0.049    0.600      0.548
                   marital              married                  0.000     -        -          -
                   marital              not married              0.025     0.033    0.749      0.454
Birthweight        intercept ($\nu_2$)  -                        3.238     0.017    195.392    0.000
                   age                  -                        -0.004    0.001    -3.863     0.000
                   age^2                -                        -0.000    0.000    -1.708     0.088
                   citizenship          Italian                  0.000     -        -          -
                   citizenship          Eastern Europe           0.041     0.016    2.653      0.008
                   citizenship          other citizenship        -0.031    0.019    -1.608     0.108
                   education            middle school or less    0.000     -        -          -
                   education            high school              0.023     0.014    1.674      0.094
                   education            degree or above          0.043     0.017    2.462      0.014
                   marital              married                  0.000     -        -          -
                   marital              not married              0.011     0.012    0.904      0.366

variance of gestational age ($\sigma_1^2$)    1.776
variance of birthweight ($\sigma_2^2$)        0.171
covariance ($\sigma_{12}$)                    0.248
correlation ($\rho_{12}$)                     0.450

Table 7: Regression results for the gestational age and the birthweight

We now analyze the results concerning the latent structure of the model (Table 8).
As mentioned above, K = 3 different latent classes are detected. Note that the estimated
p-values refer to the comparisons between classes 2 or 3 versus class 1. The most
representative class is the first one, with a weight equal to $\pi_1 = 0.931$, whereas
the remaining women are assigned to the third class, with $\pi_3 = 0.041$, and to
the second class, with $\pi_2 = 0.028$.





                                   k = 1     k = 2             k = 3
education ($\xi_{k1}$)             0.005     -0.165 (0.234)    -0.005 (0.964)
marital status ($\xi_{k2}$)        0.026     0.289 (0.081)     -0.794 (0.021)
gestational age ($\zeta_{k1}$)     0.178     -6.086 (0.000)    0.123 (0.671)
birthweight ($\zeta_{k2}$)         0.005     -1.245 (0.000)    0.728 (0.000)
class weight ($\pi_k$)             0.931     0.028             0.041

Table 8: Class weights and support points estimates (p-values in parentheses refer
to the comparison of the second and third classes with the first class)
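The identifiability constraint described in Section 4.3, namely that the weighted mean of the support points is fixed at 0, can be checked numerically on the estimates in Table 8. The snippet below is an illustrative check written by us; small deviations from zero are due to rounding of the reported estimates.

```python
# Class weights and support points as reported in Table 8.
weights = [0.931, 0.028, 0.041]
support_points = {
    "education": [0.005, -0.165, -0.005],
    "marital status": [0.026, 0.289, -0.794],
    "gestational age": [0.178, -6.086, 0.123],
    "birthweight": [0.005, -1.245, 0.728],
}

# Weighted mean of the support points for each equation: sum_k pi_k * point_k.
weighted_means = {
    name: sum(w * p for w, p in zip(weights, points))
    for name, points in support_points.items()
}

for name, value in weighted_means.items():
    print(f"{name}: {value:+.4f}")
```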



With a weight greater than 0.90, women belonging to class 1 represent the main part of
the population, so that no particular difference results with respect to the average values
of the entire population, as shown by the estimated support points, which are very close to
zero.

In contrast, women in class 2 are well characterized. With a support point
equal to 0.289, which corresponds to odds equal to exp(0.289) = 1.335, women in class
2 show a significantly higher propensity (at the 10% level) to be not married than
women in the first class. On average, they give birth 6.1 weeks earlier and their infants
weigh 1.245 kg less. No significant difference results for the educational level. Finally,
women in class 3 have a higher tendency to be married (odds equal to exp(-0.794) = 0.452
for not being married) and the birthweight of their infants is significantly higher (+0.728
kg with respect to the first class). On the other hand, no significant difference results
with respect to educational level and gestational age.

Our estimates allow us to assess potential heterogeneity across mothers in the effect of
educational attainment on infant health outcomes. We focus our attention on birthweight
because, differently from recent findings in this literature, the estimated effect of education
on gestational age is not significant. The estimates suggest that higher education increases
birthweight by a significant, although quantitatively not large, amount: on average, the
newborn of a mother with a degree weighs 43 g more than the newborn of a mother with a
basic educational level. This impact on birthweight is more limited for mothers who only
achieve a high school level; in this case the significance level lies between 5% and 10%,
which casts some doubt on whether this level of education has an impact on child health.
Three facts stand out from our results, as we explain in the following.

First, the finite mixture SEM estimates of the effect of education on birthweight
are lower than the standard regression ones (and have higher p-values), a finding common to
studies similar to ours (Abrevaya and Dahl, 2008), and in line with the a priori expectation
that the upward bias in regression estimates is driven by the correlation between the
mother's education and unobserved variables.

Second, the finite mixture SEM suggests a significant and positive effect of education
on the probability of being married. Specifically, not completing high school or a degree would
lower the mother's odds of being married by about 14% and 37%, respectively. While the
range of variation of these estimates may reflect the specificity of our sample, we note that
in most cases they are at odds with the estimates reported in this specific literature. See, for
example, Breierova and Duflo (2004) for a study in Indonesia and Lefgren and McIntyre
(2006) for a study in the US. As is known, given the Catholic background of marriage in
Italy, the postponement decisions of mothers with higher education do not influence their
propensity towards traditional cohabitation.
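The percentages quoted above appear to follow from the education coefficients of the marital status equation in Table 6, read as log odds ratios. The Python snippet below shows the arithmetic under that assumption (the response is coded 1 for not married, so exp of a coefficient is also the ratio between the odds of being married at the baseline level and at the given level); variable names are ours.

```python
import math

# Education coefficients from Table 6 (baseline: middle school or less).
gamma = {"high school": -0.152, "degree or above": -0.468}

for level, coef in gamma.items():
    # odds(married | middle school or less) / odds(married | level)
    odds_ratio = math.exp(coef)
    reduction = 100.0 * (1.0 - odds_ratio)
    print(f"baseline vs {level}: odds ratio {odds_ratio:.3f}, "
          f"odds of being married lower by about {reduction:.0f}%")
```

The two reductions come out at roughly 14% and 37%, matching the figures in the text when they are interpreted on the odds scale.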

Third, the estimated effect of the latent classes stems qualitatively almost exclusively 
from differences between married and not married mothers or, alternatively, within the 
married group. 

Different arguments may help to explain the first result. The prevalent interpretation
is that the woman's educational level may be related to specific unobservable variables,
such as the ability to properly manage the pregnancy so as to improve the health
of the newborn. This implies that an upward bias emerges in the regression estimates, a
result consistent with the findings of Table 4. On the other hand, unlike Abrevaya and
Dahl (2008), we find a greater effect on birthweight for mothers with the highest level
of education (with respect to women with a high school degree), a result qualitatively
in accordance with Currie and Moretti (2003). Note that, if birthweight responds to
educational level, conditionally on being married or not, unobservable factors linked to
the husband may further affect the results, although latent classes have been identified.
We will return to this discussion below, when we attempt to identify a latent class
characterized by positive assortative mating among married mothers.

Despite a growing literature, little is known about the causal effect of the woman's
educational level on her marital status. Our estimates indicate that the education level
has a positive impact on the marriage probability, especially with an educated potential
partner. This is clearly not in line with the results of Breierova and Duflo (2004), in
which an increase in education does not lead to a significant impact on the probability
of a woman being currently married. Although there are several respects in which the
increase in the marriage chance is consistent with a higher level of education, we can
conjecture that our data are in line with the hypothesis that this effect is mitigated by
the specificity of the Italian labor market. The job participation rate of the mothers in
our sample is close to the national mean of about 45%, a figure more than 10 percentage
points away from the average of the EU countries and far from the aims
of the European Pact for Gender Equality. In addition, this rate is heterogeneous across
levels of education. In fact, if we refer to mothers with at least a degree before the first
pregnancy, the Italian participation rate is even much lower. In contrast to the classical
predictions in terms of marriage incentives (Lam, 1988), this low participation rate of women
may induce potential gains from household specialization. The result reinforces the known
evidence that more educated mothers use the married status to postpone the job search or to
avoid selecting jobs that do not match their individual expectations, a finding
not in contrast with the stylized fact that, over the life cycle, more educated mothers have
on average a higher participation rate.

Footnote (to the comparison with Currie and Moretti, 2003): However, the recent literature
that uses laws affecting the compulsory schooling of high school educated mothers has not
shown a positive impact on birthweight (Lindeboom et al., 2009; McCrary and Royer, 2011).

By acquiring the highest educational level, a mother can affect the identity of her future
husband. However, the identification of the effect of education on the probability of being
married and on outcomes at birth is complicated by the endogeneity involved in the partner's
educational level, which may affect mothers' choices and outcomes; that is, marriages may
well be determined by factors such as social background and geographical location. These
factors are also correlated with education, and could lead the observed correlation in
spouses' education to be partly or entirely spurious. We have recognized above that the
last latent class includes women with a higher chance of being married. The large and
positive impact of the third class on birthweight may suggest a group of married mothers
with positive assortative mating. Here, we indirectly investigate the latter effect
by including in the finite mixture SEM the father's educational level as a potential
confounder of the birthweight outcome and by evaluating the sensitivity of the
main relationship between education and birthweight. Tables 9 and 10 show the results
of these estimates. Notably, the contribution of education to the birth outcome becomes
less susceptible to ambiguous conclusions. The causal effect of a mother's education of at
least a degree is still significant at the 10% level (p-value equal to 0.053 in Table 9) and
large enough to support the reliability of the estimates, while it is even clearer that a high
school education does not contribute to explaining differences in birthweight. Moreover, our
findings appear to support the idea that the mothers in class 3 have a positive assortative
mating with a large effect on infant outcomes (compare $\zeta_{32}$ in Table 10 with its analogue
in Table 8). To emphasize the robustness of our analysis, we note that all other estimated
coefficients are close to those of the model presented in Table 7.

Footnote (to the European Pact for Gender Equality): EU Council conclusions 7370/11 of
08/03/2011.



response var.      covariate            category                 est.      s.e.     t stat.    p-value
Gestational age    intercept ($\nu_1$)  -                        39.473    0.109    362.26     0.000
                   education            middle school or less    0.000     -        -          -
                   education            high school              0.012     0.043    0.27       0.787
                   education            degree or above          0.021     0.053    0.40       0.690
                   marital              married                  0.000     -        -          -
                   marital              not married              0.013     0.034    0.39       0.699
Birthweight        intercept ($\nu_2$)  -                        3.263     0.115    28.54      0.000
                   education            middle school or less    0.000     -        -          -
                   education            high school              0.015     0.014    1.11       0.268
                   education            degree or above          0.032     0.017    1.93       0.053
                   marital              married                  0.000     -        -          -
                   marital              not married              -0.007    0.011    -0.69      0.489

Table 9: Regression results for the gestational age and the birthweight controlled for
father's educational level, age and citizenship





                                   k = 1     k = 2             k = 3
gestational age ($\zeta_{k1}$)     0.173     -5.995 (0.000)    0.172 (0.993)
birthweight ($\zeta_{k2}$)         0.035     -1.215 (0.000)    0.036 (0.993)

Table 10: Support points estimates after controlling for father's educational level (p-values
in parentheses)
6 Conclusions



The article presents new evidence on the child-health-increasing effect of education, using
a finite mixture structural equation model (SEM) to identify a causal link. In particular,
the estimation strategy, which controls for the effects of marital status and of observed and
unobserved characteristics of the mother, guarantees that the estimated effects on health
outcomes are driven by differences in education. This is made possible by the inclusion of
random parameters in each structural equation, which follow a discrete distribution with
support points and weights estimated on the basis of the dataset. These support points then
identify latent classes of individuals that allow us to adjust for unobserved confounding.

We report empirical findings showing that high education of mothers increases the
birthweight, whereas gestational age is not affected. The estimated social savings from
the birthweight increase implied by our estimates are substantial if associated with married
mothers with positive assortative mating, whom we identify through the latent classes.

The existence of a causal birthweight-increasing effect of high education has potentially
important implications for longer-term efforts aimed at reducing the incidence of low
birthweight. Policies that create incentives for high school, as an investment in human
capital, have significant potential to reduce the percentage of mothers who deliver children
below the minimum threshold weight, by increasing skill levels in care and in behaviors
consistent with a healthy pregnancy. At the very least, our results confirm that improving
education among young mothers should be viewed as a key policy to reduce the costs of
unhealthy child outcomes, a finding that appears to be emphasized at the "family level" by
the presence of a more educated husband.